ABSTRACT

As part of the University of California's recent reconsideration of the role of the SAT in 
admissions, the UC Office of the President published an extensive report, UC and the 
SAT (2001), which examined the value of SAT I: Reasoning Test scores, SAT II: Subject 
Test scores, and high school grades in predicting the grade-point averages of UC 
freshmen (UCGPA), as well as the role of economic factors in predicting UCGPA. The 
analyses in UC and the SAT were based primarily on data that had been aggregated 
across freshmen cohorts (1996 through 1999) and across UC campuses. In the current 
study, by contrast, data were analyzed within campuses and cohorts and then 
summarized. While some of our conclusions are similar to those in UC and the SAT, 
others are not. Like the earlier study, for example, our reanalyses showed that, 
considered collectively, the SAT II tests required by UC (Writing, Math, and a third test of 
the applicant's choice) are slightly superior to the SAT I as a predictor of UCGPA. But 
our reanalyses also revealed considerable variability across campuses and freshman 
cohorts in the predictive value of high school grades and test scores, which was masked 
in the earlier analyses. Also, our reanalyses did not support the conclusion in UC and 
the SAT that SAT II scores are "less sensitive" to socioeconomic factors than SAT I 
scores, an assertion that was often repeated during the SAT debate that took place in
2001 and 2002. 



1. BACKGROUND 

In February 2001, Richard Atkinson, then the President of the University of California,
made a surprise speech advocating the elimination of the SAT I: Reasoning Test as a
criterion for admission to UC (Atkinson, 2001; see also Atkinson, 2004). The idea of
abandoning the SAT at the University of California had been hotly debated several years 
earlier but had faded from public awareness. Atkinson's speech, in which he advocated 
an immediate switch to college admissions tests that are tied closely to the high school 
curriculum, sparked a wholesale reconsideration of the University's admissions criteria. 

Under UC policy, applicants must take either the SAT I: Reasoning Test or the ACT, as 
well as the SAT II Writing test, the SAT II Math test, 1 and a third SAT II test of their own 
choice. As part of the process of reexamining UC admissions policy, the Board on 
Admissions and Relations with Schools (BOARS), a University-wide faculty committee, 
commissioned an analysis of UC admissions data, to be carried out by researchers at 
the UC Office of the President (see Perry, Sawrey, & Brown, 2004). The result was an 
October 2001 report, UC and the SAT: Predictive Validity and Differential Impact of the 
SAT I and SAT II at the University of California, by Saul Geiser with Roger Studley. The 
study examined the value of SAT I: Reasoning Test scores, SAT II: Subject Test scores, 
and high school grades in predicting freshman grade-point average at the University of 
California (UCGPA) and also investigated the role of economic factors in predicting 
UCGPA. 

Two of the main conclusions of the study conducted by the UC Office of the President 
(UCOP) were the following: 

1. If the purpose of admissions tests is to predict freshman grades, “then the SAT II
is unquestionably superior to the SAT I ... according to the UC data” (page 7). 

2. “...SAT II achievement tests are not only a better predictor, but also a fairer test 
for use in college admissions insofar as they are demonstrably less sensitive 
than the SAT I to differences in socioeconomic and other background factors” 
(page 10). 2 

This study served a key role in discussions of California higher education policy: It was 
invoked to support President Atkinson’s contention that the SAT I should be abandoned. 
The results of the research were studied intensively during the year following Atkinson's 



1 There are actually two SAT II Math tests: Level IC, which is the usual test, and Level IIC, a more
advanced test taken by fewer students. The data set used in the current study did not indicate which of the 
two tests was taken. 

2 A later version of this report (Geiser & Studley, 2002; reprinted as Geiser & Studley, 2004) contained slight 
revisions of these conclusions. The first conclusion was rephrased to say that “the SAT II achievement tests 
are consistently better predictors of student success at UC than the SAT I, although the incremental gain in 
prediction is relatively modest and there is substantial redundancy across the tests.” The second conclusion 
was restated to say that “the predictive validity of the SAT II appears to be less conditioned by 
socioeconomic factors than is the SAT I” (Geiser & Studley, 2004, p. 125). 
speech, as UC officials met regularly with the College Board. In early 2002, the College
Board announced that it planned to alter the SAT I; the proposed changes were 
approved by the College Board Trustees in June 2002. The new SAT, scheduled to be 
in place by 2005, will substitute short reading items for the controversial verbal analogy 
items, incorporate more advanced math content, and add a writing section that contains 
both multiple-choice questions and an essay. These changes are expected to better 
align the test with the college preparatory courses UC applicants are required to take. 



2. GOALS OF THE CURRENT ANALYSES 

UC and the SAT provided invaluable information on the interrelationships among grades, 
test scores, and socioeconomic background for UC students. As explained in Sections 
2.1 - 2.2, however, the conclusions that can be drawn from the analyses are limited 
because of the degree to which data were aggregated. Analyses that are more fine- 
grained can reveal further information about these complex associations. In the present 
study, we sought to expand upon the UCOP study in two ways, outlined below. 

2.1. Goal 1: Analysis of Campus and Cohort Effects

Most of the UCOP analyses aggregated data over seven UC campuses, four freshman 
cohorts (1996 through 1999), or both (see Table 1A). Combining data from dissimilar 
groups of individuals can obscure relationships among variables or produce spurious 
evidence of such relationships. This phenomenon is known in statistical jargon as 
confounding of within-group effects with between-group effects. An example is the 
following: Suppose that there is no correlation between test scores and college grades 
at either Campus A or Campus B (i.e., no within-school effect). At Campus B, however,
both grades and test scores tend to be higher than at Campus A — a between-school 
effect. If the data from the two schools are combined and the correlation recalculated, 
there will appear to be a correlation between test scores and grades, but the association 
will be due entirely to the fact that, at Campus B, both grades and test scores are higher 
than at Campus A. More subtle manifestations of this phenomenon can and do occur. 
(Howell, 1997, pp. 267-268, gives an example based on actual data.) That is why test 
validity studies are typically conducted within schools. Results can be aggregated later if 
desired (e.g., Ramist, Lewis, & McCamley-Jenkins, 1994; Zwick, 1991 and 1993); this 
kind of pooled within-school analysis does not lead to the confounding phenomenon. 
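
The two-campus example above can be sketched numerically. The following is a minimal illustration using made-up scores and grades (not UC data); the `pearson` and `center` helpers and all values are assumptions introduced here for illustration only. Within each hypothetical campus the correlation is zero by construction, yet the naively pooled correlation is clearly positive, while a pooled within-campus correlation (computed after centering each campus at its own means) is again zero.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def center(values):
    """Subtract the group mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical data: no within-campus relationship between scores and grades.
campus_a_scores = [500, 500, 600, 600]
campus_a_gpa = [2.4, 2.8, 2.4, 2.8]
campus_b_scores = [650, 650, 750, 750]   # Campus B is higher on both variables
campus_b_gpa = [3.0, 3.4, 3.0, 3.4]

r_a = pearson(campus_a_scores, campus_a_gpa)
r_b = pearson(campus_b_scores, campus_b_gpa)

# Naive aggregation: combine the raw data and recompute the correlation.
r_pooled = pearson(campus_a_scores + campus_b_scores,
                   campus_a_gpa + campus_b_gpa)

# Pooled within-campus analysis: remove campus means before combining.
r_within = pearson(center(campus_a_scores) + center(campus_b_scores),
                   center(campus_a_gpa) + center(campus_b_gpa))

print(round(r_a, 3), round(r_b, 3), round(r_pooled, 3), round(r_within, 3))
```

In this constructed case the pooled correlation is about .69 even though neither campus shows any association, which is the between-group effect described in the text.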

Aggregating the data across cohorts may present particular problems. In a study of SAT 
validity sponsored by the UC Linguistic Minority Research Institute (Zwick & Schlemer, 
2004), a decision was made to conduct separate analyses of 1997 and 1998 applicants 
to UCSB after systematic differences between the two cohorts became evident. In 
particular, the correlations between test scores and freshman GPA varied substantially 
across cohorts, and these differences were especially evident in analyses of ethnic and 
language groups. This lack of consistency was probably due in part to the fact that the 
cohorts differed widely in the amount of missing data for language and ethnicity. In 
1997, ethnicity information was unavailable for 1.1% of freshmen, and language 
information was unavailable for 2.4%. In 1998, these percentages jumped to 14.5% and 
44.1%, respectively. A seemingly plausible explanation for these large differences is the 
fact that the 1998 entrants were the first cohort affected by California’s Proposition 209, 
which eliminated affirmative action in admissions to public institutions. Cohort effects 
can be expected in the data used for UC and the SAT for similar reasons. 
The analyses of the prediction of freshman GPA for individual ethnic groups in UC and
the SAT were also based on data that had been aggregated across cohorts and 
campuses. The test validity literature suggests that campus-level analyses that focus on 
ethnic groups can reveal important differences among these groups. For example, 
among 1998 freshmen at UCSB (Zwick & Schlemer, 2004), high school GPA predicted 
freshman GPA better than SAT I scores in most student groups, as is typical, but SAT I 
Verbal score was the best predictor for both Asian-Americans and Latinos who said 
English was not their best language. (SAT II scores were not considered.) Useful 
information can be obtained from the UCOP data by conducting further analyses of the 
degree to which prediction patterns vary across ethnic groups, within campuses and 
cohorts. 

2.2. Goal 2: Analysis of Scores on Individual Tests Instead of Test Score Composites 
In most of the UCOP analyses, “SAT I score” was a composite of SAT I Verbal and Math 
scores; “SAT II score” was a composite of SAT II Writing and Math scores and the score 
on the third SAT II test selected by the applicant. The SAT II composite is of particular 
concern because it does not have a consistent meaning for all applicants. The identity 
of the third SAT II test can make a substantial difference in the “behavior” of the SAT II 
composite. For example, research by the College Board and by UCOP suggests that 
SAT II language test scores behave quite differently from other SAT II scores 
(Bridgeman, Burton, & Cline, 2004; Geiser & Studley, 2001, October, pp. 14-15). 

To see how forming composite scores can obscure important information, consider a 
simple example from the most recent national data on college-bound seniors (College 
Board, 2003). The average SAT I composite score (as defined by UCOP) was slightly 
higher for Asian-Americans (1083) than for Whites (1063). By considering the SAT I test 
sections separately, however, we can see that Asian-Americans, who are much more 
likely to have taken precalculus and calculus than Whites, actually scored an average of 
41 points higher than Whites on the Math section and 21 points lower on the Verbal
section. 

It is also important to note that the use of composites does not yield the same validity 
coefficients (in this case, correlations with freshman GPA) that would result if each 
component test were considered separately. For example, the correlation of the SAT I 
composite with freshman GPA is not, in general, the same as the multiple correlation 
that would be obtained if SAT I Verbal and Math scores were considered as two 
separate predictors of freshman GPA in a regression equation. The differences can be 
substantial. 3 In general, much more can be learned about the distinctions between the 
SAT I and the SAT II if each component score is examined individually. 
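
The distinction between the two kinds of validity coefficient can be checked with a short calculation. The sketch below uses the illustrative correlations given in footnote 3 (equal Verbal and Math variances, an intercorrelation of .7, and correlations of .2 and .4 with freshman GPA); the variable names are assumptions introduced here, and the formulas are the standard ones for the correlation of a sum and for the two-predictor multiple correlation.

```python
from math import sqrt

# Illustrative values from footnote 3 (not estimates from the UC data).
r_vm = 0.7   # correlation between SAT I Verbal and SAT I Math
r_mg = 0.2   # correlation of Math with freshman GPA
r_vg = 0.4   # correlation of Verbal with freshman GPA

# Composite = Verbal + Math, with both scores standardized to variance 1:
#   Cov(V + M, GPA) = r_vg + r_mg,  Var(V + M) = 2 + 2 * r_vm
r_composite = (r_vg + r_mg) / sqrt(2 + 2 * r_vm)

# Squared multiple correlation for two predictors entered separately.
r2_multiple = (r_vg ** 2 + r_mg ** 2 - 2 * r_vg * r_mg * r_vm) / (1 - r_vm ** 2)
r_multiple = sqrt(r2_multiple)

print(round(r_composite, 3), round(r_multiple, 3))  # 0.325 0.415
```

The composite forces equal weights on the two sections, so its correlation with GPA cannot exceed the multiple correlation, which lets the regression choose the weights.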



3. DATA 

The analyses conducted in the current study were based on two data sets supplied by 
the UC Office of the President in response to a formal request. The primary data set, 



3 Both kinds of validity coefficients can be obtained theoretically, given certain assumptions about 
correlations and variances. I constructed an example in which SAT I Math and Verbal scores had the same 
variance and had an intercorrelation of .7. Math score and Verbal score were correlated .2 and .4, 
respectively, with freshman GPA. All these assumptions are plausible. In this example, the correlation of 
the SAT I composite with freshman GPA is .325, while the multiple correlation obtained from the regression 
of GPA on SAT I Verbal and Math scores (considered as two predictors) is .415. 
which corresponds to the one used in UC and the SAT, contains information on 77,893
students admitted to seven campuses of the University of California (Berkeley, Davis, 
Los Angeles, Irvine, Riverside, San Diego, Santa Barbara) in 1996, 1997, 1998, and 
1999. UC Santa Cruz was excluded from the data base because it does not assign 
conventional grades, and UC San Francisco was excluded because it is solely a 
graduate institution. Data for UC Riverside were missing for 1997 and 1998 (as 
described in UC and the SAT). The variables contained in the data set include high 
school grade-point average (HSGPA), 4 freshman GPA at the University of California 
(UCGPA), admissions test scores, ethnicity, and, for 85 percent of the students (66,584 
of 77,893), data on parental income and education. The admissions test scores consist 
of SAT I Verbal and Math scores, SAT II Math and Writing scores, and the score on the 
third SAT II test chosen by the applicant. The identity of the third test is not included in 
the data set, however. 

The following additional data were requested from UCOP for use in the current project: 

• Newly available data on UC Riverside students that were used by UCOP to prepare 
the January 2002 report, Research Addendum: Additional Findings on UC and the 
SAT, by Saul Geiser and Roger Studley. 

• Data on students' gender, primary language, the identity of the "third" SAT II test 
submitted, and the multiple-choice and writing sample subscores of the SAT II. 

• Data on applicants, as well as enrolled students. 

The first of these requests was met, but the remaining ones were not. The Riverside 
data set that was supplied included data for all four cohorts (7,282 cases). Except 
where noted, this secondary data set was used as the source of Riverside data in the 
analyses reported here. The total number of cases analyzed in the current study (using 
the secondary data set for UC Riverside and the primary data set for the remaining six 
campuses) is 81,801. Table 1A shows the number of included students by campus and
cohort; Table 1B gives the number of students for whom parental income and education
data were available. Within the primary and secondary data sets, all records have 
complete data on high school grades, admissions test scores, and UCGPA. 5 

Table 1C gives the admission rates for the campus-cohort combinations involved in this 
study, which are useful in interpreting the analysis results. UC Berkeley and UCLA are 
the most selective of the seven campuses, UC Riverside is least selective, and the 
remaining campuses occupy a middle ground. 



4 According to Geiser and Studley (2004, p. 127), “HSGPA used in this analysis is an honors-weighted GPA 
with additional grade-points for honors-level courses; HSGPA is uncapped and may exceed 4.0.” 

5 No information on the number of cases excluded by UCOP because of missing data on these variables 
was provided with the data set, nor does it appear in UC and the SAT. In response to an inquiry, UCOP
analysts reported that they excluded 10,528 additional cases because of incomplete data on grades or test
scores (R. Studley, personal communication, February 21, 2002). In our own study, 27 students in the
secondary Riverside data set had to be excluded from analyses because their recorded UCGPAs were out
of range; some were as high as 15. We later discovered that a total of six cases in the remaining 74,519
records in the UCOP data set (.008%) also had out-of-range UCGPAs. Because the number was small and
the out-of-range UCGPAs were not extreme (ranging between 4.3 and 6.55), no analyses were redone.
4. ANALYSES

The analyses conducted for this project are summarized in the next three subsections. 
Section 4.1 presents descriptive information for key variables. Section 4.2 gives results 
on the prediction of UCGPA within each campus and cohort for several different 
regression models, some of which contain parental income and education along with test 
scores and high school grades. Section 4.3 describes analyses that focus on the 
relative accuracy of the prediction of UCGPA for African-American, Asian-American, 
Latino, and White students. 

4.1. Descriptive Results for Key Variables

Tables 2A-2I give, for the four cohorts and seven campuses, the means and standard 
deviations of the key variables used in the study: high school GPA, SAT I Verbal score, 
SAT I Math score, SAT II Writing score, SAT II Math score, SAT II third test score, 
UCGPA, parent income, and parent education. Tables 2A-2G show that grades and test 
scores were typically highest at Berkeley and UCLA and lowest at Riverside. In general, 
the academic qualifications of students increased over time, particularly at UCSB. 
Parental income was extremely variable within each campus-cohort combination, but on 
average was highest at Santa Barbara and Berkeley (with mean income always 
exceeding $80,000 per year) and lowest by far at Riverside (never exceeding $57,000). 
Income (expressed here in constant 1998 dollars) increased over time at most 
campuses. The average number of years of parental education (for the student's more 
educated parent) exceeded 16 in all cohorts at Berkeley, UCSD, and UCSB. The 
average parental education was lowest at Riverside (14.6 years) and Irvine (15.5 years). 

4.2. Prediction of UCGPA Within Campuses and Cohorts 

To assure that we were working with exactly the same data set as Geiser and Studley, 
we first conducted analyses like those used to produce Tables 1, 2, and 3 of UC and the
SAT (using the primary data set originally provided to us). Our results were identical to 
those reported. Following that, we estimated a number of alternative regression models 
for each campus and cohort. In each case, the dependent variable was UCGPA. The 
predictors included in each of our ten primary models are summarized in Table 3. 

Table 4 gives the estimated squared correlations (R²) for Models 1-8, which do not
include parental income and education. The medians of the R² values for each model
appear at the foot of the table. For Models 6, 7, and 8, the R² values are also plotted in
Figures 1-3. A comparison of these models shows that the SAT II-HSGPA combination
is slightly more effective in predicting UCGPA than is the SAT I-HSGPA combination
(Model 6; R² = .168), explaining about one percent more of the variance in UCGPA when
only the SAT II Math and Writing tests are included (Model 7; R² = .179), and two percent
more when the third test is included as well (Model 8; R² = .186).

While these results corroborate the findings of UC and the SAT that SAT II scores are 
superior to SAT I scores as predictors of UCGPA, the differences in the proportion of 
explained variance are tiny. Also, as discussed below, the predictive power of the SAT II 
is largely attributable to the SAT II Writing test. Using SAT I scores alone (without 
HSGPA) to predict UCGPA produces an R² of only .084 (Model 2). The R² values for
the remaining models range between .110 and .126. It is worth noting that the summary
values of R² in Table 4, which are the medians of the within-campus-cohort results, are
all smaller than the (roughly) corresponding values in UC and the SAT (Table 1, p. 3).
For example, the R² for HSGPA reported in UC and the SAT, computed on the
combined data for all campuses and cohorts, is .154; the corresponding value in Table 4
of the current report is .126. The likely reason is the type of confounding of effects
discussed in Section 2.1: The R² values computed on the combined data are inflated
because campuses and cohorts with high values on the predictor variables also tend to
have high freshman grades. 6

Two other trends are evident in the results of Table 4 and Figures 1-3: First, the 
predictive value of each model varies considerably across campuses. For models 6-8, 
prediction tends to be weakest at UC San Diego and best at UC Davis. Second, there is 
a tendency for the R² values to decrease between 1996 and 1999, although this pattern
is by no means consistent across the seven campuses. A possible reason for the 
variation in predictive effectiveness is that it results from differences in selectivity: UC 
San Diego is the third most selective school (after Berkeley and UCLA) and UC Davis is 
less selective (see Table 1C). In addition, admission rates generally fell between 1996 
and 1999 (Table 1C). It would seem reasonable to expect the greater selectivity in some 
campuses and cohorts to reduce the variability of academic performance variables and 
hence the correlations among them. It must be noted, however, that the standard
deviations reported in Tables 2A-2G do not seem to support this hypothesis. 
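
The reasoning behind the selectivity hypothesis can be illustrated with a small constructed data set (hypothetical values, not UC data; the helper functions and the cutoff are assumptions made for this sketch). Admitting only the upper half of a hypothetical applicant pool reduces the standard deviation of the predictor and, with it, the correlation between predictor and outcome.

```python
from math import sqrt

def sd(values):
    """Sample standard deviation."""
    n = len(values)
    m = sum(values) / n
    return sqrt(sum((v - m) ** 2 for v in values) / (n - 1))

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical applicant pool: outcome = predictor plus fixed alternating noise.
x_all = list(range(10))
y_all = [x + (2 if x % 2 == 0 else -2) for x in x_all]

# A "selective" campus keeps only applicants in the top half of the predictor.
admitted = [(x, y) for x, y in zip(x_all, y_all) if x >= 5]
x_adm = [x for x, _ in admitted]
y_adm = [y for _, y in admitted]

r_all, r_adm = pearson(x_all, y_all), pearson(x_adm, y_adm)
sd_all, sd_adm = sd(x_all), sd(x_adm)

print(round(sd_all, 2), round(sd_adm, 2))  # predictor spread shrinks
print(round(r_all, 2), round(r_adm, 2))    # correlation shrinks as well
```

If selectivity were driving the campus and cohort differences, the more selective campus-cohort combinations would be expected to show the smaller standard deviations, which is the pattern checked against Tables 2A-2G.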

One of our analysis goals was to reexamine the effects of adding parental income 7 and 
education to academic variables as predictors of UCGPA. The conclusion in UC and the 
SAT that “much of the apparent relationship between the SAT I and UC freshman
grades is conditioned by socioeconomic factors” (p. 9) was based on a regression
analysis performed on combined data for all cohorts and campuses. First, Geiser and
Studley fit a regression model in which HSGPA and test scores were used to predict
UCGPA. Then, they added two more predictors to the model: parental income and 
education. They found that “the predictive weights for both the SAT II and HSGPA are 
undiminished (and in fact increase slightly). In contrast, the weight for the SAT I ... falls 
sharply” (p. 9). The standardized regression coefficients in question are shown in Table 
5, which is equivalent to Table 6 of the UCOP report (p. 8). (A more detailed version of 
this table appears as Table 7 of Geiser & Studley, 2004, p. 136.) We conducted 
analyses within campuses and cohorts, with each test component considered 
separately, to determine whether the pattern shown in Table 5 was evident. In fact, our 
results did not generally follow this pattern. 

Tables 6 and 7 give the estimated standardized regression coefficients, as well as the R²
values for Models 9 and 10, respectively. The last row gives the medians of the
regression coefficients and R² values. Both models include high school GPA and all five
test scores; Model 10 includes parental income and education as well. To facilitate
comparison of the models, both were estimated using only those students for whom
income and education data were available (see Table 1B).

A significant finding is that adding parental income and education increases the 
proportion of explained variance (R²) by an average of only .006 (i.e., from .188 in Table
6 to .194 in Table 7). Furthermore, on average, the coefficients of the individual 



6 The range restriction phenomenon described in Section 5.1 is also relevant. 

7 The analyses in UC and the SAT used the log of parental income, and therefore we did so in all of our
analyses as well. (Only the descriptive results in Table 2H are based on untransformed income. In 
economic analyses, the log transformation is commonly applied to income to make the distribution more 
symmetric.) 
predictors changed by no more than .02. The average coefficients for the SAT I Verbal
and SAT II Writing tests decreased trivially, while the average coefficient for HSGPA and 
for the SAT II third test increased trivially. For the remaining variables, the average 
coefficients stayed the same to two decimal places. These findings, then, are not 
consistent with the conclusion in UC and the SAT that the predictive power of the SAT I 
was more “sensitive” to socioeconomic factors than the predictive power of the SAT II. 
Adding parental income and education, in fact, had very little impact on the regression 
results, as reflected in the minimal changes in R² values within campuses and cohorts. 8
There are several possible explanations for the difference between these conclusions 
and those of UC and the SAT. The most important is that, for reasons described earlier, 
the analyses in this study have been conducted within campus and cohort. Averaging 
the within-group analyses is a better way to assess the effectiveness of these variables 
in predicting UCGPA than is analyzing the combined data. Second, our analyses used 
each test score separately, rather than forming SAT I and SAT II composites. A third 
reason is that our analyses incorporated the updated data set from UC Riverside. The 
analyses of Tables 6 and 7 are therefore based on 70,610 cases (see Table 1B), while
the related analyses in UC and the SAT were based on 66,584 cases. 

Several other aspects of the results in Tables 6 and 7 are also noteworthy. First, the
results show that, given the set of predictors considered here, high school GPA is
the single most effective predictor of UCGPA (i.e., its average standardized regression
coefficient is highest), followed by the SAT II Writing Test. The SAT II third test also
makes a contribution, although its effectiveness varies considerably across campuses 
and cohorts. The remaining test scores (SAT I Math and Verbal, SAT II Math) contribute 
little, given the predictors included in Models 9 and 10. 

4.3 Prediction Accuracy for Ethnic Groups 

An important goal of our investigation was to determine whether prediction of UCGPA 
was equally effective for all ethnic groups. One way we studied this was to combine the 
data for all ethnic groups, estimate a regression equation (to predict UCGPA from test 
scores and high school grades), and then examine the degree to which the use of this 
common equation produced predicted UCGPA values that tended to be too high 
("overprediction") or too low ("underprediction") for each group. In actual applications of 
regression analysis in college admissions, a single equation is typically derived for all 
students. If this equation yielded UCGPA predictions that were systematically "off" for a 
particular group, this result would be consistent with the definition of test bias articulated 
by Anne Cleary. Her definition states that a test is biased against a particular subgroup 
of test-takers "if the criterion score [in this case, UCGPA] predicted from the common 
regression line is consistently too high or too low for members of the subgroup" (Cleary, 
1968, p. 115). Researchers today would be more likely to use the term "prediction bias" 
rather than "test bias" because there are many possible reasons for errors in prediction. 
Because sample sizes for some ethnic groups were small, we conducted this analysis 
for two "mega-cohorts": the 1996 and 1997 class, combined, and the 1998 and 1999 
class, combined. (Recall that a major change in UC admissions policy — the elimination 
of affirmative action — went into effect beginning with the class of 1998.) We conducted 
the prediction accuracy analyses for African-American, Asian-American, Hispanic, and 



8 In a study of 1993 graduates of public high schools who entered UC, Rothstein (2004) found that
high-school-level socioeconomic variables did contribute substantially to the prediction of UCGPA. Data from all
UC undergraduates (except those from Santa Cruz) were combined.
White students. The numbers of students in other ethnic groups, such as Native
Americans, were too small to allow separate analysis. 

Tables 8-14 show the results of the prediction accuracy analyses for Models 6 and 7 for 
the two mega-cohorts at each of the seven campuses. As shown in Table 4, Model 6 
includes HSGPA and SAT I Verbal and Math scores; Model 7 includes HSGPA and SAT 
II Writing and Math scores. The main entries of Tables 8-14 give the difference, in 
grade-points, between the average observed and predicted UCGPA values (observed 
minus predicted) for each group in each analysis. It is a property of least squares 
regression that the sum of the prediction errors for all observations within a particular 
analysis will be zero, implying that underpredictions for some groups will be balanced by 
overpredictions for other groups. (The sum of the prediction errors shown in Tables 8-14 
is not zero because the results for the "other" group have been omitted from the 
displays.) The results show that prediction errors are quite similar for the SAT I and SAT 
II models, which is consistent with the conclusion reported in UC and the SAT (p. 13). In 
22 of 28 analyses (7 campuses x 2 mega-cohorts x 2 models), White students' UCGPAs 
were underpredicted by at least .05 (in grade-points), with one discrepancy reaching 
nearly .15 (UC Irvine, 1996-1997). Hispanic and Asian-American students were each 
overpredicted by an average of at least .05 in 12 of 28 analyses, and African-Americans 
were overpredicted by at least .05 in four analyses and underpredicted by at least .05 in 
four. Prediction errors over .05 were somewhat more likely to occur in 1998-1999 than
in 1996-1997. Application of regression models that incorporated parental income and 
education (not shown) produced essentially the same pattern of prediction errors. 
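
The prediction accuracy computation described above can be sketched as follows. This is a minimal illustration with a single predictor and made-up records for two hypothetical groups, "A" and "B" (none of the values come from the UC data, and `fit_line` is a helper written for this sketch). A common least squares line is fit to all students, and the mean of observed minus predicted UCGPA is then computed for each group.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y on a single predictor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Hypothetical records: (group, predictor score, observed UCGPA).
records = [
    ("A", 2.0, 2.9), ("A", 3.0, 3.3), ("A", 4.0, 3.7),
    ("B", 1.0, 2.1), ("B", 2.0, 2.5), ("B", 3.0, 2.9),
]

# One common equation for all students, as in typical admissions use.
intercept, slope = fit_line([r[1] for r in records], [r[2] for r in records])

# Prediction error = observed minus predicted.
errors = [(g, y - (intercept + slope * x)) for g, x, y in records]

mean_error = {}
for group in ("A", "B"):
    group_errors = [e for g, e in errors if g == group]
    mean_error[group] = sum(group_errors) / len(group_errors)

total_error = sum(e for _, e in errors)

# Positive mean error: underprediction; negative mean error: overprediction.
print({g: round(m, 3) for g, m in mean_error.items()}, round(total_error, 9))
```

Because the errors sum to zero over all students, the underprediction for hypothetical group A is exactly offset by the overprediction for group B in this two-group case.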

Studies have often found that GPAs were overpredicted for Black and Hispanic test- 
takers and underpredicted for White and Asian-American test-takers (see Young, 2004 
for a review). There are a number of theories about the reasons for these pervasive 
patterns of prediction errors. Some have attributed the phenomenon to statistical 
artifacts; others believe they are related to the differing college experiences of various 
student groups. A brief overview of the most prevalent hypotheses appears here; see 
Zwick, 2002, pp. 117-124 for a more detailed review. One conjecture is that minority 
and White students are likely to differ in ways that are not fully captured by either their 
test scores or their high school grades. For example, a Black student and a White 
student who both have high school GPAs of 3.5, SAT Verbal scores of 600, and SAT 
Math scores of 650 may nevertheless differ in terms of factors like the quality of early 
schooling, the environment in the home, and the aspirations of the family, all of which 
can influence academic preparation. 

A related explanation for overprediction (Vars & Bowen, 1998) is that among qualified 
applicants, a smaller percentage of Whites than Blacks are admitted to college. White 
college students, according to this reasoning, are therefore more likely to have been 
selected using stringent (though perhaps informal) criteria that involve academic factors 
not captured by SAT scores (or, presumably, high school grades). In a similar vein, Linn 
(1983) laid out an explicit statistical model that shows how affirmative action policies
could explain the phenomenon. A more controversial theory about overprediction is that 
college grades are biased against Blacks and other people of color, and tests are not. 
Under this scenario, raised by Klitgaard (1985), tests give a more accurate reading of 
students' capabilities than the subsequent evaluations of their academic performance. 

A technical explanation that has been offered repeatedly in the psychometric literature is 
that overprediction occurs because both SAT scores and high school grades are 
imprecise measures of academic abilities. This unreliability can be shown to distort 
regression results in a way that produces overpredictions for lower-scoring groups and 
underpredictions for higher-scoring groups. 9 Seemingly, then, the imprecision of test 
scores and grades could explain overprediction for Blacks and Hispanics, and 
underprediction for Asian-Americans and Whites. But one major research finding argues 
against this technical factor as an all-purpose explanation: Female SAT-takers tend to 
score lower than male SAT-takers, yet their later grades tend to be underpredicted. In 
addition, the high reliability of the SAT (typically between .91 and .94 per section; see 
College Board and ETS, 1998, p. 29) suggests that the effects of test score imprecision 
on the regression results are likely to be small. 
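
The attenuation argument in footnote 9 can be illustrated with a small simulation. The sketch below is not part of the original analysis. It assumes two hypothetical groups that follow the same true regression line but differ in mean true ability, a single predictor with a within-group reliability of .80, and invented sample sizes; the function names and all parameter values are illustrative assumptions.

```python
import random
import statistics


def ols_slope_intercept(x, y):
    """Least-squares slope and intercept for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def simulate(n_per_group=20000, reliability=0.8, seed=0):
    """Two groups share one true line (outcome = true ability + noise)
    but differ in mean true ability. The predictor is measured with error."""
    rng = random.Random(seed)
    # Error SD chosen so that, within a group, var(true) / var(observed) = reliability.
    err_sd = ((1 - reliability) / reliability) ** 0.5
    rows = []
    for group, mean_true in (("lower", -0.5), ("higher", 0.5)):
        for _ in range(n_per_group):
            t = rng.gauss(mean_true, 1.0)       # true ability (hypothetical scale)
            x = t + rng.gauss(0.0, err_sd)      # observed, error-laden score
            y = t + rng.gauss(0.0, 1.0)         # outcome depends on true ability only
            rows.append((group, x, y))
    # One common regression equation fitted to everyone.
    slope, intercept = ols_slope_intercept([r[1] for r in rows], [r[2] for r in rows])
    resid = {"lower": [], "higher": []}
    for group, x, y in rows:
        resid[group].append(y - (intercept + slope * x))  # observed minus predicted
    return slope, {g: statistics.fmean(v) for g, v in resid.items()}


if __name__ == "__main__":
    slope, mean_resid = simulate()
    print("pooled slope:", round(slope, 3))
    print("mean observed minus predicted:", mean_resid)
```

Under these assumptions the fitted slope falls below the true slope of 1, and the mean of observed minus predicted values is negative for the lower-scoring group (overprediction) and positive for the higher-scoring group (underprediction). Raising `reliability` toward 1 shrinks both errors, which is the sense in which a highly reliable test leaves little room for this explanation.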

Another category of hypotheses about overprediction is based on the assumption that 
when in college, minority students are not fulfilling their academic potential, which is 
assumed to be accurately captured by the tests. This “underperformance” could occur 
because of outright racism or because of a campus environment that is inhospitable to 
people of color, or it could be related to a greater occurrence among minority students of 
life difficulties, including financial problems, that interfere with academic performance. It 
has also been hypothesized that anxieties, low aspirations, or negative attitudes may 
interfere with the academic success of minority students (e.g., see Bowen & Bok, 1998, 
p. 84; McWhorter, 2000). 

The "stereotype threat" theory of Steele and Aronson (1998) has been offered as 
another possible explanation for overprediction (e.g., Vars & Bowen, 1998, p. 475; 
Bowen & Bok, 1998, p. 81). Stereotype threat ("the threat of being viewed through the 
lens of a negative stereotype, or the fear of doing something that would inadvertently 
confirm that stereotype") produces stress, which causes students to "learn to care less 
about the situations ... that bring it about" and, ultimately, to perform more poorly (Steele, 
August 1999, pp. 4, 5). In some circumstances, the researchers claim, merely asking 
test-takers to state their sex or ethnic group can be damaging to their performance. 
Steele and Aronson (1998) focused on the impact of stereotype threat on African- 
American students in testing situations, and concluded by saying that their "analysis 
uncovers a social and psychological predicament that is rife in the standardized testing 
environment ..." The goal of their studies, they said, was to "seek to explain why blacks 
underperform in college relative to equally well-prepared whites" (Steele & Aronson, 
1998, pp. 425-426). 

Although the stereotype threat research is intriguing, it does not provide a 
straightforward explanation of the overprediction/underperformance phenomenon. If 
stereotype threat depressed standardized test performance, but didn't affect subsequent 
academic work, it would be expected to lead to underprediction because the affected 
students would perform better in college than their (depressed) test scores would 
indicate. To explain the existing pattern of test results and college grades, we'd have to 
hypothesize that stereotype threat had more effect on college grades than on 
admissions test performance, which seems contrary to the researchers' implication that 
standardized testing situations are particularly evocative of stereotype threat (Steele & 
Aronson, 1998; Steele, 1997). 



9 The effect is simplest to understand in the case of one predictor. Here, under typical assumptions about 
the nature of measurement errors in test scores, the effect of the measurement error on the regression 
analysis is to produce a regression line that is flatter (less steep) than the line that would theoretically be 
obtained with an error-free predictor (see Snedecor & Cochran, 1967, pp. 164-166). 

For now, the perplexing overprediction phenomenon remains, at least in part, a mystery. 
Unmeasured differences between White students and Black and Hispanic students with 
the same test scores and previous grades certainly explain at least part of the 
overprediction. But it seems plausible that a greater incidence among minority students 
of life difficulties and financial problems in college contributes to the phenomenon as 
well. 

The finding of overprediction for Asian-American students in the current study is 
somewhat unusual. A recent finding by Zwick and Schlemer (2004) suggests that this 
result may be related to the language background of California students. In a study of 
two freshman cohorts (1997 and 1998) at UCSB, Zwick and Schlemer found accurate 
prediction or underprediction of freshman GPA for Asian-American students who said 
that their first language was English, and some evidence of overprediction for those who 
said their first language was “another language.” Overprediction was quite severe for a 
third group of Asian-Americans — those who said their first language was “English and 
another language.” Aggregating the three groups in the Zwick and Schlemer study (to 
maximize the comparability of the results with those of the current study) yields a finding 
of overprediction (.05 in 1997 and .06 in 1998) for a regression model containing 
HSGPA and SAT I Verbal and Math scores. 
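
The prediction errors discussed in this section can be summarized as group means of observed minus predicted GPA under a common regression equation, with negative values indicating overprediction. The following is a minimal sketch of that bookkeeping, assuming the predicted values have already been obtained from some fitted model. The record layout, the function names, and the example values are hypothetical; the .05 threshold simply mirrors the magnitude used in the text.

```python
from collections import defaultdict
from statistics import fmean


def mean_prediction_error(records):
    """records: iterable of (group, observed_gpa, predicted_gpa).

    Returns {group: mean(observed - predicted)}.
    A negative mean indicates overprediction, a positive mean underprediction.
    """
    by_group = defaultdict(list)
    for group, observed, predicted in records:
        by_group[group].append(observed - predicted)
    return {group: fmean(errors) for group, errors in by_group.items()}


def label(mean_error, threshold=0.05):
    """Classify a mean error against a grade-point threshold."""
    if mean_error <= -threshold:
        return "overpredicted"
    if mean_error >= threshold:
        return "underpredicted"
    return "within threshold"


if __name__ == "__main__":
    # Invented records for two hypothetical groups, for illustration only.
    demo = [("A", 3.0, 3.1), ("A", 2.8, 2.9), ("B", 3.2, 3.1)]
    for group, err in mean_prediction_error(demo).items():
        print(group, round(err, 4), label(err))
```

Because the common equation is fitted to all students together, the group means are tied to one another: if some groups are overpredicted, others must be underpredicted.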



5. SUMMARY AND DISCUSSION 

The current study shows that valuable information can be obscured when aggregated 
data are analyzed as in UC and the SAT. Our reanalyses of the UC admissions data 
revealed considerable variability across campuses and freshman cohorts in the 
predictive value of high school grades and test scores, which was masked in the 
analysis of the combined data. Also, our analyses within campuses and cohorts did not 
support the conclusion in UC and the SAT that SAT II scores are "less sensitive" to 
socioeconomic factors than SAT I scores, an assertion that was repeated countless 
times during the recent SAT debate in California. Our main findings were as follows: 

a. The SAT II is slightly superior to the SAT I as a predictor of UCGPA, although the 
pattern of results across campuses and cohorts is similar for the two tests. For each 
campus and cohort, several alternative prediction models were compared. The 
combination of SAT II Math and Writing scores and HSGPA explained about one percent 
more of the variance in UCGPA than the combination of SAT I Verbal and Math scores 
and HSGPA. When the SAT II third test was included along with the Math and Writing 
tests, the superiority of the SAT II-HSGPA model over the SAT I-HSGPA model 
increased to two percent of the variance in UCGPA. 

b. Of the five test scores considered (two SAT I scores and three SAT II scores), the 
SAT II Writing test emerged as the best predictor of UCGPA in the analyses within 
campus and cohort. UC and the SAT also reported on the effectiveness of the SAT II 
Writing Test as a predictor (e.g., p. 19). As in most such studies, our analyses showed 
that HSGPA was the best single predictor of UCGPA. 

c. The degree to which UCGPA could be predicted by admissions tests and high school 
grades varied substantially over campuses and cohorts. In general, predictive 
effectiveness declined between 1996 and 1999. Although the results are not entirely 
clear-cut, there is a tendency for predictive effectiveness to be smaller at the campuses 
with lower admission rates (Berkeley, UCLA, UCSD) than at the remaining campuses, 
and for predictive effectiveness to decrease between 1996 and 1999, during which time 
admission rates also tended to decrease. It would seem reasonable to expect the 
greater selectivity in some campuses and cohorts to reduce the variability of academic 
performance variables and hence the correlations among them. It must be noted, 
however, that the standard deviations of test scores and grades for the students in this 
study do not seem to support this hypothesis. In general, all models in the current study 
appeared to be less effective in predicting UCGPA (i.e., produced smaller R² values) 
than the analogous models in UC and the SAT. This disparity suggests that the findings 
on the aggregated data were inflated by the confounding of two kinds of effects: (1) the 
relationship of the various predictors with UCGPA in terms of the average values for the 
28 campus-cohort combinations (the “between” effect), and (2) the relationship of the 
various predictors with UCGPA for students within each of the campus-cohort 
combinations (the “within” effect). 

d. The predictive power of the SAT II was not found to be “less conditioned by 
socioeconomic factors” than the predictive power of the SAT I. In UC and the SAT, 
Geiser and Studley fit a regression model to the combined data in which HSGPA and 
test scores were used to predict UCGPA. When they subsequently added parental 
income and education to the model, they found the predictive weights for the SAT II and 
HSGPA to be undiminished, while the weight for the SAT I fell sharply (p. 9). This 
pattern of results led to the inference that “much of the apparent relationship between 
the SAT I and UC freshman grades is conditioned by socioeconomic factors” (p. 9). 

This, in turn, led to a conclusion that the SAT II was more fair. (As noted in footnote 3, 
Geiser and Studley moderated this conclusion somewhat in a later publication.) When 
we conducted analyses within campuses and cohorts, with each test component 
considered separately, however, we did not find the same pattern of results. In general, 
adding parental income and education to the model led to an increase in R² of only .004. 
Although slight changes in the regression coefficients occurred, as one would expect, 
the coefficients for the SAT I were not generally more affected by the introduction of the 
socioeconomic status variables than the coefficients for the SAT II. On average, the 
coefficients of the individual predictors changed by no more than .02. The average 
coefficients for the SAT I Verbal and SAT II Writing tests decreased trivially, while the 
average coefficient for HSGPA and for the SAT II third test increased trivially. For the 
remaining variables, the average coefficients stayed the same to two decimal places. 

e. Whether the SAT I or the SAT II was used in predicting UCGPA, White students’ 
grades were usually underpredicted by at least .05 (in grade points) when a common 
regression equation was used for all students, while Hispanic and Asian-American 
students’ grades were often overpredicted by at least that amount. Both underprediction 
and overprediction occurred for African Americans. Incorporating parental income and 
education into the prediction equation did not substantially alter the pattern of prediction 
errors. Overprediction of GPAs for African-American and Hispanic students has been 
common in past research, and has been attributed to a wide variety of factors ranging 
from incomplete specification of regression models and the unreliability of predictor 
variables to the particular challenges faced by students of color in the college 
environment. Overprediction of Asian-Americans’ GPAs is more unusual, and may be 
related to the language background of Asian-American students in California. Some 
recent research findings (Zwick & Schlemer, 2004) suggest that college grades may 
tend to be underpredicted for Asian-Americans whose first language is English, while 
grades for other Asian-American students tend to be overpredicted. 
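
The distinction drawn in finding (c) between the “between” effect and the “within” effect can be made concrete with simulated data. The sketch below is an illustration under invented assumptions, not a reanalysis of the UC data: it generates 28 hypothetical groups whose mean predictor and mean outcome move together, with a weaker relationship inside each group, and compares the R² from pooling all cases with the median R² computed within groups. All names and parameter values are assumptions made for the example.

```python
import random
from statistics import fmean, median


def r_squared(x, y):
    """Squared Pearson correlation, which equals R² for a one-predictor regression."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def simulate(n_groups=28, n_per_group=500, within_slope=0.4, seed=1):
    """Return (pooled R², median within-group R²) for simulated data."""
    rng = random.Random(seed)
    pooled_x, pooled_y, within = [], [], []
    for _ in range(n_groups):
        # A group "level" shifts both the predictor mean and the outcome mean.
        level = rng.gauss(0.0, 1.0)
        xs = [level + rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        ys = [level + within_slope * (x - level) + rng.gauss(0.0, 1.0) for x in xs]
        within.append(r_squared(xs, ys))
        pooled_x.extend(xs)
        pooled_y.extend(ys)
    return r_squared(pooled_x, pooled_y), median(within)


if __name__ == "__main__":
    pooled, within_median = simulate()
    print("pooled R²:", round(pooled, 3))
    print("median within-group R²:", round(within_median, 3))
```

With these made-up settings the pooled R² is noticeably larger than the median within-group R², because the pooled analysis also credits the predictor with the differences among group means.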

5.1. Limitations and Future Research Plans 

This study was limited by the nature of the available data. For example, the data set did 
not include gender and primary language, which may substantially affect the prediction 
of UCGPA. Also absent from the data set was the identity of the SAT II third test taken 
by each student. In addition, because demographic information was not available on the 
10,528 cases that were excluded from the UCOP data set, it is impossible to determine 
the effect of their exclusion. Finally, the unavailability of applicant data made it 
impossible to consider the application of range restriction corrections. These 
adjustments are sometimes applied in admissions research to adjust for the restriction of 
range (of test scores and HSGPA, and, as a result, of college grades as well) that occurs 
because students whose SATs or high school grades are too low to allow admission will 
not have freshman GPAs. In general, range restriction curtails the size of the observed 
correlations between predictor variables and college GPA. The apparent association of 
test scores and HSGPA with college GPA is therefore smaller than it would be if all 
applicants could be considered (see Gulliksen, 1987, Chapter 13; Howell, 1997, pp. 266- 
267). Statistical procedures have been developed to estimate the correlations for the full 
population of applicants. These range restriction corrections are only approximate at 
best, however, because they rely on unrealistically simple assumptions about the 
selection process (see Rothstein, 2004, for an interesting consideration of this issue). 
Nevertheless, the corrections may facilitate comparisons among institutions or student 
groups that are affected to varying degrees by range restriction (Ramist et al., 1994, p. 5; 
see also Kobrin, Camara, & Milewski, 2004). 10 
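
To give a sense of what such an adjustment looks like, the sketch below implements the standard univariate correction for direct selection on the predictor (often called Thorndike's Case 2). It is offered only as an illustration of the general idea: no correction was applied in this study, and the standard deviations in the example are hypothetical. The function name and its arguments are assumptions made for the example.

```python
import math


def correct_for_range_restriction(r, sd_restricted, sd_unrestricted):
    """Univariate correction for direct range restriction on the predictor.

    r               : correlation observed in the selected (restricted) group
    sd_restricted   : predictor SD among enrolled students
    sd_unrestricted : predictor SD among all applicants
    """
    k = sd_unrestricted / sd_restricted
    return (r * k) / math.sqrt(1.0 - r * r + r * r * k * k)


if __name__ == "__main__":
    # Hypothetical values: observed r = .30, enrolled SD = 80, applicant SD = 110.
    print(round(correct_for_range_restriction(0.30, 80.0, 110.0), 3))
```

The correction leaves the correlation unchanged when the two standard deviations are equal and raises it as the applicant pool becomes more variable than the enrolled group. It assumes selection on a single predictor and a linear, homoscedastic relationship, which is one reason the resulting estimates are approximate at best.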

A subsequent article will describe additional analyses of the UC data using hierarchical 
linear modeling. This approach can better reflect the structure of the data, in which 
students are “nested” within campuses and cohorts. The first level of the model 
describes the regression of students’ UCGPAs on a set of predictors within the 28 
campus-cohort combinations. The second level of the model describes the dependence 
of the level-1 regression coefficients on predictors that are defined at the campus-cohort 
level, such as admission rate and average test score. 
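
The two-level structure described above can be approximated, very roughly, by a two-stage procedure: fit a separate regression within each campus-cohort combination, then regress the resulting slopes on a group-level predictor. The sketch below illustrates that idea with simulated data. It is a simplified stand-in and not hierarchical linear modeling proper, which estimates both levels jointly and weights groups by the precision of their estimates. The data, the single predictor, the function names, and the assumed dependence of the slope on admission rate are all hypothetical.

```python
import random
from statistics import fmean


def ols(x, y):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    mx, my = fmean(x), fmean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope


def two_stage(groups):
    """groups: list of (group_level_predictor, xs, ys).

    Stage 1: regress y on x within each group.
    Stage 2: regress the stage-1 slopes on the group-level predictor.
    Returns (intercept, slope) of the stage-2 regression.
    """
    level1_slopes = [ols(xs, ys)[1] for _, xs, ys in groups]
    group_predictor = [g[0] for g in groups]
    return ols(group_predictor, level1_slopes)


def simulate(n_groups=28, n_per_group=1000, seed=2):
    """Simulated groups whose within-group slope rises with admission rate."""
    rng = random.Random(seed)
    groups = []
    for _ in range(n_groups):
        admission_rate = rng.uniform(0.3, 0.9)      # hypothetical group-level predictor
        true_slope = 0.2 + 0.3 * admission_rate     # assumed level-2 relationship
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        ys = [true_slope * x + rng.gauss(0.0, 1.0) for x in xs]
        groups.append((admission_rate, xs, ys))
    return groups


if __name__ == "__main__":
    gamma0, gamma1 = two_stage(simulate())
    print("level-2 intercept:", round(gamma0, 3), "level-2 slope:", round(gamma1, 3))
```

In this toy setup the stage-2 slope comes out positive, reflecting the assumption built into the simulation that the predictor is more strongly related to the outcome in groups with higher admission rates.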


TABLES 

Table 1A 

Number of Students in Each Cohort and Campus 

Year      UCB     UCD     UCI    UCLA     UCR    UCSD    UCSB    Total
1996     3461    3204    3009    3595    1178    2387    3142    19976
1997     2830    2271    1859    2949    1736    2464    2221    16330
1998     3579    3333    2964    4058    1951    3181    3349    22415
1999     3490    3108    3537    4034    2417    2925    3569    23080
Total   13360   11916   11369   14636    7282   10957   12281    81801



Table 1B 

Number of Students Providing Income and Parental Education Information 

Year      UCB     UCD     UCI    UCLA     UCR    UCSD    UCSB    Total
1996     2950    2881    2769    3204    1152    2131    2803    17890
1997     2472    2057    1714    2609    1688    2202    1953    14695
1998     2765    2806    2491    3304    1852    2565    2695    18478
1999     2815    2597    3065    3356    2339    2445    2930    19547
Total   11002   10341   10039   12473    7031    9343   10381    70610




Table 1C 

Admission Rates for Each Cohort and Campus 

Campus   Year   Applications   Admissions   Admission Rate
UCB      1996           6242         2081             0.33
         1997           5937         2211             0.37
         1998           5987         2108             0.35
         1999           6319         2036             0.32
UCD      1996           4641         3251             0.70
         1997           4463         3076             0.69
         1998           4241         2895             0.68
         1999           4474         2842             0.64
UCI      1996           4190         2566             0.61
         1997           3778         2152             0.57
         1998           3752         2028             0.54
         1999           4170         2467             0.59
UCLA     1996           7760         3531             0.46
         1997           7052         3049             0.43
         1998           7505         2840             0.39
         1999           8488         3285             0.39
UCR      1996           2814         2289             0.81
         1997           2407         2148             0.89
         1998           2503         2034             0.81
         1999           2927         2431             0.83
UCSD     1996           4050         2337             0.58
         1997           3951         2399             0.61
         1998           4187         2308             0.55
         1999           4680         2569             0.55
UCSB     1996           4891         3588             0.73
         1997           4485         3208             0.72
         1998           4751         3281             0.69
         1999           5330         3550             0.67


Table 2A 

Means and Standard Deviations (parenthesized) for HSGPA 

Year   UCB          UCD          UCI          UCLA         UCR          UCSD         UCSB
1996   4.01 (.42)   3.76 (.37)   3.67 (.37)   3.99 (.38)   3.53 (.44)   3.95 (.32)   3.52 (.39)
1997   4.10 (.37)   3.79 (.38)   3.73 (.35)   4.08 (.33)   3.48 (.43)   3.96 (.31)   3.59 (.39)
1998   4.12 (.38)   3.76 (.36)   3.73 (.32)   4.09 (.35)   3.51 (.39)   3.94 (.32)   3.66 (.37)
1999   4.16 (.39)   3.75 (.37)   3.62 (.43)   3.70 (.27)   3.53 (.41)   4.03 (.28)   3.72 (.38)



Table 2B 

Means and Standard Deviations (parenthesized) for SAT I Verbal Score 

Year   UCB            UCD            UCI            UCLA           UCR            UCSD           UCSB
1996   634.3 (89.0)   564.4 (86.6)   528.7 (87.5)   604.3 (80.5)   512.0 (93.2)   593.8 (82.6)   550.1 (80.3)
1997   651.4 (87.1)   569.9 (89.4)   537.4 (85.1)   615.1 (76.0)   511.5 (97.9)   603.9 (77.3)   572.7 (82.4)
1998   651.5 (87.3)   557.2 (91.1)   537.9 (80.4)   622.2 (77.0)   512.9 (93.9)   603.1 (78.3)   570.9 (78.8)
1999   639.6 (94.1)   563.7 (92.8)   557.7 (76.3)   620.7 (80.8)   507.7 (96.0)   608.9 (82.7)   578.9 (81.2)



Table 2C 

Means and Standard Deviations (parenthesized) for SAT I Math Score 

Year   UCB            UCD            UCI            UCLA           UCR            UCSD           UCSB
1996   666.6 (91.2)   603.3 (75.8)   593.6 (83.7)   635.1 (84.4)   550.6 (94.8)   634.0 (74.8)   569.9 (79.4)
1997   687.4 (83.6)   614.5 (74.8)   599.8 (79.8)   655.6 (74.9)   561.2 (95.3)   646.5 (73.1)   598.2 (76.5)
1998   685.5 (79.9)   596.2 (80.6)   593.1 (84.0)   654.3 (79.4)   557.8 (91.2)   641.8 (72.7)   594.2 (77.8)
1999   670.8 (89.0)   601.8 (82.6)   603.5 (82.2)   655.5 (85.2)   552.5 (95.0)   648.2 (74.1)   602.5 (79.7)


Table 2D 

Means and Standard Deviations (parenthesized) for SAT II Writing Score 

Year   UCB             UCD             UCI             UCLA            UCR             UCSD            UCSB
1996   609.9 (93.1)    537.7 (86.3)    500.0 (81.7)    577.9 (87.6)    483.9 (84.8)    567.3 (84.7)    523.0 (81.9)
1997   632.4 (91.1)    545.7 (86.4)    507.8 (84.2)    598.2 (79.9)    484.1 (89.4)    579.6 (78.9)    542.2 (82.0)
1998   630.0 (92.3)    529.8 (91.4)    507.1 (80.9)    601.1 (86.6)    482.8 (87.0)    574.4 (83.6)    540.2 (82.8)
1999   641.5 (100.6)   557.8 (94.2)    543.4 (82.4)    622.5 (91.0)    500.6 (86.5)    605.4 (88.5)    571.5 (85.8)



Table 2E 

Means and Standard Deviations (parenthesized) for SAT II Math Score 

Year   UCB            UCD            UCI            UCLA           UCR            UCSD           UCSB
1996   652.3 (95.5)   588.6 (79.8)   577.1 (84.0)   618.7 (89.5)   532.7 (93.6)   618.3 (77.2)   547.1 (81.9)
1997   670.8 (92.3)   594.9 (79.5)   575.8 (84.3)   634.0 (81.1)   544.2 (94.5)   623.3 (77.2)   571.7 (82.1)
1998   672.3 (88.1)   579.2 (85.9)   573.3 (86.5)   635.5 (87.1)   537.3 (89.3)   624.5 (78.1)   570.7 (82.4)
1999   664.3 (98.0)   588.1 (87.4)   587.9 (88.5)   643.6 (91.8)   538.7 (91.2)   636.9 (79.8)   583.4 (85.0)



Table 2F 

Means and Standard Deviations (parenthesized) for SAT II Third Test Score 

Year   UCB             UCD             UCI             UCLA            UCR             UCSD            UCSB
1996   661.7 (97.6)    585.2 (99.4)    572.2 (112.7)   627.1 (94.7)    547.9 (117.9)   605.9 (92.1)    549.7 (98.5)
1997   682.1 (89.3)    598.2 (97.3)    576.1 (108.6)   641.5 (85.5)    553.6 (118.4)   617.7 (86.1)    577.5 (93.9)
1998   677.8 (90.9)    584.9 (106.4)   577.9 (109.7)   646.6 (88.4)    561.2 (117.0)   616.5 (91.3)    581.5 (97.4)
1999   671.5 (98.2)    588.8 (106.8)   585.5 (109.4)   644.7 (96.0)    557.9 (123.3)   643.1 (93.8)    586.9 (98.2)


Table 2G 

Means and Standard Deviations (parenthesized) for UCGPA 

Year   UCB          UCD          UCI          UCLA         UCR          UCSD         UCSB
1996   3.06 (.62)   2.78 (.64)   2.75 (.62)   3.00 (.59)   2.51 (.78)   2.93 (.58)   2.78 (.58)
1997   3.16 (.58)   2.84 (.62)   2.87 (.57)   3.08 (.53)   2.51 (.75)   2.97 (.53)   2.88 (.57)
1998   3.14 (.58)   2.79 (.64)   2.81 (.59)   3.10 (.55)   2.48 (.73)   2.96 (.56)   2.92 (.56)
1999   3.10 (.61)   2.73 (.68)   2.74 (.63)   3.09 (.58)   2.52 (.81)   3.00 (.55)   2.92 (.59)



Table 2H 

Means and Standard Deviations (parenthesized) for Income (in dollars) 

Year   UCB               UCD               UCI               UCLA              UCR               UCSD              UCSB
1996   82,360 (81,736)   74,908 (67,125)   61,443 (58,025)   74,995 (72,788)   56,687 (71,873)   79,903 (66,431)   81,850 (71,220)
1997   89,518 (94,649)   78,419 (69,501)   62,952 (61,525)   81,682 (76,267)   54,218 (64,054)   89,766 (86,282)   90,378 (75,564)
1998   86,120 (81,746)   72,396 (64,608)   65,975 (63,559)   78,848 (79,860)   48,681 (52,464)   82,299 (76,495)   87,901 (71,518)
1999   82,100 (79,311)   77,232 (69,975)   70,355 (68,879)   80,436 (82,792)   53,119 (59,135)   81,422 (74,706)   86,068 (74,431)



Table 2I 

Means and Standard Deviations (parenthesized) for Parent Education 

Year   UCB          UCD          UCI          UCLA         UCR          UCSD         UCSB
1996   16.3 (3.1)   15.7 (3.4)   15.2 (3.2)   15.7 (3.4)   14.7 (3.5)   16.3 (3.0)   16.2 (3.0)
1997   16.6 (3.1)   16.0 (3.3)   15.3 (3.2)   16.3 (3.2)   14.6 (3.5)   16.5 (2.9)   16.5 (2.9)
1998   16.6 (3.0)   15.4 (3.4)   15.2 (3.2)   16.0 (3.2)   14.6 (3.5)   16.3 (2.9)   16.3 (3.0)
1999   16.1 (3.4)   15.5 (3.5)   15.5 (3.1)   15.9 (3.3)   14.6 (3.4)   16.1 (3.1)   16.2 (3.1)


Table 3 

Predictors Used in the Initial Set of Regression Models 

Model   HS GPA   SAT I    SAT I    SAT II    SAT II   SAT II   Parent   Parent
                 Verbal   Math     Writing   Math     Third    income   education
1       X
2                X        X
3                                  X         X
4                                  X         X        X
5                X        X        X         X        X
6       X        X        X
7       X                          X         X
8       X                          X         X        X
9       X        X        X        X         X        X
10      X        X        X        X         X        X        X        X


Table 4 

R² Values for Each Campus and Cohort - Models 1-8 

Campus   Year     Model 1   Model 2   Model 3   Model 4   Model 5   Model 6   Model 7   Model 8
UCB      1996     0.122     0.090     0.118     0.132     0.132     0.149     0.166     0.174
         1997     0.135     0.088     0.117     0.127     0.129     0.153     0.173     0.178
         1998     0.085     0.071     0.098     0.111     0.112     0.112     0.129     0.138
         1999     0.097     0.112     0.137     0.147     0.147     0.146     0.163     0.171
UCD      1996     0.129     0.092     0.121     0.139     0.139     0.191     0.209     0.224
         1997     0.162     0.067     0.095     0.127     0.127     0.197     0.209     0.237
         1998     0.145     0.139     0.167     0.200     0.201     0.228     0.244     0.269
         1999     0.136     0.083     0.112     0.130     0.130     0.190     0.202     0.218
UCI      1996     0.114     0.076     0.110     0.117     0.120     0.180     0.199     0.205
         1997     0.148     0.132     0.141     0.163     0.170     0.225     0.225     0.243
         1998     0.073     0.092     0.119     0.128     0.129     0.139     0.158     0.166
         1999     0.049     0.080     0.096     0.103     0.106     0.117     0.130     0.137
UCLA     1996     0.167     0.138     0.152     0.163     0.168     0.215     0.224     0.231
         1997     0.113     0.085     0.110     0.122     0.125     0.159     0.179     0.187
         1998     0.124     0.092     0.112     0.128     0.129     0.167     0.179     0.191
         1999     0.096     0.100     0.123     0.132     0.134     0.159     0.175     0.181
UCR      1996     0.197     0.134     0.142     0.153     0.159     0.244     0.241     0.247
         1997     0.127     0.103     0.108     0.111     0.119     0.187     0.183     0.185
         1998     0.133     0.053     0.072     0.077     0.080     0.169     0.173     0.176
         1999     0.113     0.064     0.076     0.079     0.081     0.157     0.155     0.158
UCSD     1996     0.078     0.065     0.084     0.094     0.094     0.131     0.146     0.155
         1997     0.119     0.043     0.065     0.082     0.084     0.152     0.168     0.182
         1998     0.096     0.051     0.070     0.088     0.089     0.145     0.159     0.174
         1999     0.068     0.074     0.097     0.105     0.105     0.140     0.153     0.163
UCSB     1996     0.168     0.118     0.132     0.139     0.143     0.242     0.244     0.249
         1997     0.165     0.068     0.096     0.103     0.104     0.217     0.229     0.233
         1998     0.153     0.073     0.093     0.099     0.102     0.196     0.207     0.213
         1999     0.166     0.083     0.101     0.102     0.107     0.219     0.223     0.225
Median            0.126     0.084     0.110     0.125     0.126     0.168     0.179     0.186


Table 5 

Standardized Regression Coefficients from Geiser and Studley for Prediction Models 
With and Without Parental Income and Education (for all UC Campuses, 1996-1999) 

Regression Model:              HS GPA   SAT I   SAT II   Parent   Parent
                                                         Income   Education
Without Income and Education   .27      .07     .23      X        X
With Income and Education      .28      .02     .24      .03      .06

Note: This table contains the same information as Table 6 of Geiser & Studley (2001, 
October, p. 8). 


Table 6 

Regression Results for Each Campus and Cohort - Model 9 

Campus   Year     HS GPA   SAT I         SAT I    SAT II    SAT II   SAT II   R²
                           Verbal        Math     Writing   Math     Third
UCB      1996     .25*     -.01          -.04     .21*      -.01     .12*     .176
         1997     .27*     -.05          .01      .21*      -.03     .09*     .174
         1998     .20*     -.01          -.10*    .18*      .08      .13*     .145
         1999     .19*     .02           -.03     .20*      .05      .11*     .174
UCD      1996     .31*     .00           .05      .20*      .02      .14*     .229
         1997     .34*     .01           .01      .13*      .02      .18*     .229
         1998     .28*     .03           -.01     .20*      .06      .18*     .270
         1999     .30*     .05           .04      .15*      .00      .13*     .215
UCI      1996     .30*     .08*          -.05     .19*      .12*     .07*     .203
         1997     .30*     .13*          .02      .11*      .04      .14*     .254
         1998     .20*     .05           -.01     .18*      .09*     .10*     .167
         1999     .19*     .06           -.02     .15*      .11*     .09*     .137
UCLA     1996     .29*     .05           .06      .15*      .03      .08*     .235
         1997     .27*     .03           .02      .21*      -.02     .10*     .188
         1998     .27*     .04           .04      .16*      -.02     .11*     .186
         1999     .22*     [illegible]   -.01     .18*      .01      .07*     .174
UCR      1996     .34*     .04           .14*     .15*      -.06     .06      .263
         1997     .29*     .13*          .08      .10*      -.03     .04      .192
         1998     .34*     .10*          -.02     .14*      -.02     .04      .188
         1999     .30*     .07           .10*     .11*      -.04     .05      .166
UCSD     1996     .25*     .04           -.04     .12*      .15*     .10*     .157
         1997     .32*     .01           -.05     .15*      .09*     .13*     .183
         1998     .29*     .01           -.03     .18*      .07      .14*     .176
         1999     .25*     .01           .09*     .18*      .03      .11*     .160
UCSB     1996     .35*     .11*          .05      .14*      .03      .07*     .265
         1997     .36*     .06           .04      .17*      .03      .07*     .240
         1998     .34*     .07*          .01      .17*      .01      .06*     .217
         1999     .36*     .13*          -.03     .13*      .05      .02      .234
Median            .29      .05           .01      .17       .03      .10      .188

* Regression coefficient is statistically significant at α = .01. 




Table 7 

Regression Results for Each Campus and Cohort - Model 10 

Campus   Year     HS GPA   SAT I    SAT I    SAT II    SAT II   SAT II   Parent   Parent      R²
                           Verbal   Math     Writing   Math     Third    income   Education
UCB      1996     .25*     -.02     -.05     .20*      -.01     .13*     .02      .03         .178
         1997     .27*     -.06     -.00     .19*      -.03     .11*     .02      .06         .178
         1998     .21*     -.03     .11*     .16*      .07      .14*     .06*     .06         .153
         1999     .20*     -.02     .05      .17*      .05      .13*     .06*     .07*        .183
UCD      1996     .31*     -.02     .04      .19*      .02      .15*     .00      .06*        .232
         1997     .35*     -.00     .01      .12*      .02      .19*     .01      .05         .232
         1998     .29*     .01      -.02     .19*      .06      .19*     .06*     .02         .274
         1999     .30*     .04      .03      .14*      .00      .14*     .01      .04         .217
UCI      1996     .30*     .07*     -.06     .18*      .12*     .08*     -.01     .04         .204
         1997     .30*     .12*     .02      .10*      .04      .15*     -.01     .04         .255
         1998     .21*     .05      -.02     .18*      .09*     .10*     -.01     .04         .168
         1999     .19*     .05      -.03     .14*      .11*     .09*     .01      .06*        .140
UCLA     1996     .29*     .03      .03      .13*      .03      .10*     .01      .09*        .241
         1997     .28*     .01      .01      .19*      -.02     .11*     .06*     .05         .195
         1998     .28*     .01      .03      .14*      -.02     .13*     .06*     .04         .192
         1999     .23*     .07*     -.02     .16*      .01      .09*     .03      .08*        .181
UCR      1996     .35*     .03      .14*     .14*      -.06     .06      .01      .01         .263
         1997     .29*     .11*     .07      .09       -.03     .05      .04      .03         .195
         1998     .34*     .09      -.03     .13*      -.02     .05      .06      .01         .192
         1999     .30*     .07      .11*     .12*      -.04     .04      .01      -.03        .167
UCSD     1996     .26*     .02      -.05     .10*      .14*     .11*     .02      .10*        .166
         1997     .32*     -.01     -.06     .14*      .08*     .13*     .03      .06         .188
         1998     .30*     -.01     -.04     .16*      .06      .15*     .04      .05         .180
         1999     .24*     -.02     .07      .15*      .03      .12*     .07*     .07*        .171
UCSB     1996     .35*     .09*     .03      .13*      .04      .08*     .03      .05         .269
         1997     .36*     .04      .02      .15*      .03      .08*     .02      .06         .244
         1998     .34*     .05      -.02     .15*      .01      .08*     .04      .09*        .227
         1999     .36*     .12*     -.05     .11*      .05      .04      .02      .06*        .238
Median            .30      .03      .01      .15       .03      .11      .02      .05         .194

* Regression coefficient is statistically significant at α = .01. 


Table 8A 

Prediction Accuracy Results for UC Berkeley: Model 6 

                            1996-1997                    1998-1999
Ethnicity              n    Observed minus          n    Observed minus
                            predicted UCGPA              predicted UCGPA
African American     321    -0.0253               206     0.0165
Asian American      2779    -0.0459              3207    -0.0415
Hispanic             810    -0.0459               570    -0.1054
White               1878     0.0857              2124     0.0727



Table 8B

Prediction Accuracy Results for UC Berkeley: Model 7

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      321      -0.0223         206       0.0240
Asian American       2779      -0.0391        3207      -0.0436
Hispanic              810      -0.0344         570      -0.0941
White                1878       0.0722        2124       0.0710

Table 9A

Prediction Accuracy Results for UC Davis: Model 6

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      109       0.0008         160       0.0811
Asian American       2196      -0.0757        2532      -0.0663
Hispanic              443       0.0390         608       0.0023
White                2358       0.0616        2440       0.0558



Table 9B

Prediction Accuracy Results for UC Davis: Model 7

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      109      -0.0069         160       0.0846
Asian American       2196      -0.0648        2532      -0.0682
Hispanic              443       0.0343         608      -0.0014
White                2358       0.0545        2440       0.0569

Table 10A

Prediction Accuracy Results for UC Irvine: Model 6

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American       72      -0.0175         133      -0.0232
Asian American       3219      -0.0429        3662      -0.0452
Hispanic              428      -0.0103         717      -0.0367
White                 893       0.1466        1318       0.1200



Table 10B

Prediction Accuracy Results for UC Irvine: Model 7

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American       72      -0.0317         133      -0.0115
Asian American       3219      -0.0392        3662      -0.0435
Hispanic              428      -0.0122         717      -0.0350
White                 893       0.1366        1318       0.1127



Table 11A

Prediction Accuracy Results for UCLA: Model 6

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      319      -0.0338         279      -0.0919
Asian American       2622      -0.0805        3293      -0.0602
Hispanic             1004      -0.0496         898      -0.0935
White                2129       0.1120        2582       0.1073



Table 11B

Prediction Accuracy Results for UCLA: Model 7

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      319      -0.0409         279      -0.0772
Asian American       2622      -0.0680        3293      -0.0545
Hispanic             1004      -0.0509         898      -0.0899
White                2129       0.0994        2582       0.0970

Table 12A

Prediction Accuracy Results for UC Riverside: Model 6

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      110       0.0583         219      -0.0003
Asian American       1506      -0.0346        2061      -0.0266
Hispanic              486      -0.0146         856       0.0022
White                 660       0.0796         844       0.0609



Table 12B

Prediction Accuracy Results for UC Riverside: Model 7

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      110       0.0471         219      -0.0091
Asian American       1506      -0.0257        2061      -0.0216
Hispanic              486      -0.0353         856      -0.0069
White                 660       0.0751         844       0.0561

Table 13A

Prediction Accuracy Results for UC San Diego: Model 6

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American       83      -0.1021          72       0.0432
Asian American       1953      -0.0321        2433      -0.0460
Hispanic              416      -0.0713         503      -0.0697
White                2022       0.0409        2309       0.0585



Table 13B

Prediction Accuracy Results for UC San Diego: Model 7

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American       83      -0.0782          72       0.0530
Asian American       1953      -0.0305        2433      -0.0463
Hispanic              416      -0.0555         503      -0.0651
White                2022       0.0349        2309       0.0566

Table 14A

Prediction Accuracy Results for UC Santa Barbara: Model 6

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      135      -0.0042         173      -0.0065
Asian American        948      -0.0965        1048      -0.0920
Hispanic              663      -0.0432         950      -0.0828
White                3349       0.0353        3811       0.0366



Table 14B

Prediction Accuracy Results for UC Santa Barbara: Model 7

                           1996-1997                1998-1999
Ethnicity               n  Observed minus        n  Observed minus
                           predicted UCGPA          predicted UCGPA
African American      135      -0.0233         173      -0.0138
Asian American        948      -0.0904        1048      -0.0942
Hispanic              663      -0.0558         950      -0.0819
White                3349       0.0362        3811       0.0348
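
As we read the column heading, each "Observed minus predicted UCGPA" entry in Tables 8A through 14B is a group average of regression residuals: a positive value means the group's freshman grades were, on average, higher than the model predicted (underprediction), and a negative value means they were lower (overprediction). The following is a minimal Python sketch of that averaging step, not the authors' actual procedure. The function name, the record layout, and the four example records are hypothetical and are not data from the study.

```python
from collections import defaultdict


def mean_residual_by_group(records):
    """Return {group: (n, mean of observed minus predicted)}.

    `records` is an iterable of (group, observed_gpa, predicted_gpa)
    tuples; this layout is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0, 0.0])  # group -> [count, residual sum]
    for group, observed, predicted in records:
        totals[group][0] += 1
        totals[group][1] += observed - predicted
    return {g: (n, s / n) for g, (n, s) in totals.items()}


# Hypothetical records (group label, observed GPA, predicted GPA).
records = [
    ("A", 3.0, 2.5),
    ("A", 2.0, 2.5),
    ("B", 3.5, 3.0),
    ("B", 3.0, 3.0),
]
result = mean_residual_by_group(records)
```

In this toy input, group "A" has residuals that cancel (mean 0.0), while group "B" is underpredicted on average (mean 0.25), which is the same reading that applies to the signs in the tables.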

Figure 1 (Model 6)

Regression Results: Using High School

Figure 2 (Model 7)

Regression Results: Using High School GPA,

Percent of UC GPA Variance Explained

Figure 3 (Model 8)

Regression Results: Using High School GPA, SAT II Math, SAT II Writing, & SAT II Third Test to